catastrophic risk
How California's New AI Law Protects Whistleblowers
Booth is a reporter at TIME. Governor Gavin Newsom speaks at Google about preparing students and workers for the next generation of technology, in San Francisco, California, on August 7, 2025. CEOs of the companies racing to build smarter AI--Google DeepMind, OpenAI, xAI, and Anthropic--have been clear about the stakes.
- Law (1.00)
- Government (0.73)
Quantifying Risks in Multi-turn Conversation with Large Language Models
Wang, Chengxiao, Chaudhary, Isha, Hu, Qian, Ruan, Weitong, Gupta, Rahul, Singh, Gagandeep
Large Language Models (LLMs) can produce catastrophic responses in conversational settings that pose serious risks to public safety and security. Existing evaluations often fail to fully reveal these vulnerabilities because they rely on fixed attack prompt sequences, lack statistical guarantees, and do not scale to the vast space of multi-turn conversations. In this work, we propose QRLLM, a novel, principled Certification framework for Catastrophic risks in multi-turn Conversation for LLMs that bounds the probability of an LLM generating catastrophic responses under multi-turn conversation distributions with statistical guarantees. We model multi-turn conversations as probability distributions over query sequences, represented by a Markov process on a query graph whose edges encode semantic similarity to capture realistic conversational flow, and quantify catastrophic risks using confidence intervals. We define several inexpensive and practical distributions: random node, graph path, and adaptive with rejection. Our results demonstrate that these distributions can reveal substantial catastrophic risks in frontier models, with certified lower bounds as high as 70% for the worst model, highlighting the urgent need for improved safety training strategies in frontier LLMs.
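The abstract leaves the certification procedure implicit, but the core idea of bounding a catastrophic-response rate under a graph-path distribution can be sketched in a few lines. This is a minimal Monte Carlo illustration, not QRLLM's actual interface: the `is_catastrophic` judge, the row-stochastic `transition` matrix, and the `queries` list are all assumptions.

```python
import numpy as np
from scipy.stats import beta

def sample_conversation(transition, n_turns, rng):
    """Random walk on the query graph: a Markov chain whose edge
    weights (semantic similarity) act as transition probabilities."""
    node = rng.integers(len(transition))
    path = [node]
    for _ in range(n_turns - 1):
        node = rng.choice(len(transition), p=transition[node])
        path.append(node)
    return path

def certified_lower_bound(is_catastrophic, transition, queries,
                          n_samples=1000, n_turns=5, alpha=0.05, seed=0):
    """One-sided Clopper-Pearson lower bound (confidence 1 - alpha) on
    the probability that a sampled multi-turn conversation elicits a
    catastrophic response. `is_catastrophic` is a hypothetical judge
    over the model's replies to the sampled query sequence."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_samples):
        path = sample_conversation(transition, n_turns, rng)
        hits += bool(is_catastrophic([queries[i] for i in path]))
    if hits == 0:
        return 0.0
    return beta.ppf(alpha, hits, n_samples - hits + 1)
```

A certified lower bound of 70% would correspond to this bound exceeding 0.7 after the judge flags most sampled conversations.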
- North America > United States > California (0.14)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- (2 more...)
If Anyone Builds It, Everyone Dies review – how AI could kill us all
What if I told you I could stop you worrying about climate change, and all you had to do was read one book? Great, you'd say, until I mentioned that the reason you'd stop worrying was because the book says our species only has a few years before it's wiped out by superintelligent AI anyway. We don't know what form this extinction will take exactly - perhaps an energy-hungry AI will let the millions of fusion power stations it has built run hot, boiling the oceans. Maybe it will want to reconfigure the atoms in our bodies into something more useful. There are many possibilities, almost all of them bad, say Eliezer Yudkowsky and Nate Soares in If Anyone Builds It, Everyone Dies, and who knows which will come true.
- North America > United States (0.17)
- Oceania > Australia (0.05)
- North America > Mexico (0.05)
- Europe > Ukraine > Kyiv Oblast > Chernobyl (0.05)
- Government (1.00)
- Leisure & Entertainment > Sports (0.71)
- Health & Medicine (0.67)
Dimensional Characterization and Pathway Modeling for Catastrophic AI Risks
Although discourse around the risks of Artificial Intelligence (AI) has grown, it often lacks a comprehensive, multidimensional framework and concrete causal pathways mapping hazard to harm. This paper aims to bridge this gap by examining six commonly discussed AI catastrophic risks: CBRN, cyber offense, sudden loss of control, gradual loss of control, environmental risk, and geopolitical risk. First, we characterize these risks across seven key dimensions, namely intent, competency, entity, polarity, linearity, reach, and order. Next, we conduct risk pathway modeling by mapping step-by-step progressions from the initial hazard to the resulting harms. The dimensional approach supports systematic risk identification and generalizable mitigation strategies, while risk pathway models help identify scenario-specific interventions. Together, these methods offer a more structured and actionable foundation for managing catastrophic AI risks across the value chain.
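As a rough illustration of how the two methods fit together, the seven dimensions suggest a record type and the pathway model a hazard-to-harm step list. The field semantics and the example values below are assumptions read off the abstract, not the paper's own encoding:

```python
from dataclasses import dataclass, field

@dataclass
class RiskProfile:
    """A catastrophic AI risk scored on the paper's seven dimensions,
    plus its hazard-to-harm pathway. Field meanings are assumed."""
    name: str
    intent: str      # deliberate misuse vs. accident
    competency: str  # AI capability level the scenario presumes
    entity: str      # who or what initiates the hazard
    polarity: str
    linearity: str   # linear vs. discontinuous hazard-to-harm scaling
    reach: str       # local, national, global
    order: str       # direct effect vs. downstream/systemic effect
    pathway: list[str] = field(default_factory=list)  # step-by-step progression

cyber = RiskProfile(
    name="cyber offense",
    intent="deliberate", competency="frontier", entity="human + AI agent",
    polarity="harmful", linearity="nonlinear", reach="global", order="direct",
    pathway=[
        "model grants offensive-cyber uplift",
        "attacker compromises critical infrastructure",
        "cascading outages cause large-scale harm",
    ],
)
```

Dimensional scores like these support cross-risk comparison, while the `pathway` list marks the steps where scenario-specific interventions could break the chain.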
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
- South America > Chile (0.14)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- (18 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Against racing to AGI: Cooperation, deterrence, and catastrophic risks
Dung, Leonard, Hellrigel-Holderbaum, Max
AGI Racing is the view that it is in the self-interest of major actors in AI development, especially powerful nations, to accelerate their frontier AI development to build highly capable AI, especially artificial general intelligence (AGI), before competitors have a chance. We argue against AGI Racing. First, the downsides of racing to AGI are much higher than portrayed by this view. Racing to AGI would substantially increase catastrophic risks from AI, including nuclear instability, and undermine the prospects of technical AI safety research to be effective. Second, the expected benefits of racing may be lower than proponents of AGI Racing hold. In particular, it is questionable whether winning the race enables complete domination over losers. Third, international cooperation and coordination, and perhaps carefully crafted deterrence measures, constitute viable alternatives to racing to AGI which have much smaller risks and promise to deliver most of the benefits that racing to AGI is supposed to provide. Hence, racing to AGI is not in anyone's self-interest as other actions, particularly incentivizing and seeking international cooperation around AI issues, are preferable.
- Asia > China (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- North America > United States > New York > Monroe County > Rochester (0.04)
- (2 more...)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Military (1.00)
Military AI Cyber Agents (MAICAs) Constitute a Global Threat to Critical Infrastructure
This paper argues that autonomous AI cyber-weapons - Military-AI Cyber Agents (MAICAs) - create a credible pathway to catastrophic risk. It sets out the technical feasibility of MAICAs, explains why geopolitics and the nature of cyberspace make MAICAs a catastrophic risk, and proposes political, defensive-AI and analogue-resilience measures to blunt the threat.
- North America > United States (0.46)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- (5 more...)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (1.00)
What Is AI Safety? What Do We Want It to Be?
Harding, Jacqueline, Kirk-Giannini, Cameron Domenico
The field of AI safety seeks to prevent or reduce the harms caused by AI systems. A simple and appealing account of what is distinctive of AI safety as a field holds that this feature is constitutive: a research project falls within the purview of AI safety just in case it aims to prevent or reduce the harms caused by AI systems. Call this appealingly simple account The Safety Conception of AI safety. Despite its simplicity and appeal, we argue that The Safety Conception is in tension with at least two trends in the ways AI safety researchers and organizations think and talk about AI safety: first, a tendency to characterize the goal of AI safety research in terms of catastrophic risks from future systems; second, the increasingly popular idea that AI safety can be thought of as a branch of safety engineering. Adopting the methodology of conceptual engineering, we argue that these trends are unfortunate: when we consider what concept of AI safety it would be best to have, there are compelling reasons to think that The Safety Conception is the answer. Descriptively, The Safety Conception allows us to see how work on topics that have historically been treated as central to the field of AI safety is continuous with work on topics that have historically been treated as more marginal, like bias, misinformation, and privacy. Normatively, taking The Safety Conception seriously means approaching all efforts to prevent or mitigate harms from AI systems based on their merits rather than drawing arbitrary distinctions between them.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- North America > United States (0.04)
- Africa > Eswatini > Manzini > Manzini (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
- Media (0.66)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Threshold Crossings as Tail Events for Catastrophic AI Risk
We analyse circumstances in which bifurcation-driven jumps in AI systems are associated with emergent heavy-tailed outcome distributions. By analysing how a control parameter's random fluctuations near a catastrophic threshold generate extreme outcomes, we demonstrate in what circumstances the probability of a sudden, large-scale transition aligns closely with the tail probability of the resulting damage distribution. Our results contribute to research in monitoring, mitigation and control of AI systems when seeking to manage potentially catastrophic AI risk.
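The claimed alignment is easy to see in a toy simulation: when outcomes are smooth below the threshold and jump discontinuously above it, the damage distribution's tail probability equals the crossing probability. The Gaussian control parameter, the threshold value, and the Pareto jump sizes below are illustrative assumptions, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
lam_c = 3.0                    # catastrophic threshold (assumed)
lam = rng.normal(2.0, 0.5, n)  # control parameter fluctuating near it

# Assumed outcome model: a smooth response below the threshold,
# a large heavy-tailed jump once the threshold is crossed.
damage = np.where(lam < lam_c,
                  lam ** 2,                        # smooth regime (< 9)
                  1e3 * (1 + rng.pareto(1.5, n)))  # bifurcation jumps

p_cross = (lam >= lam_c).mean()
p_tail = (damage >= 1e3).mean()
print(f"P(threshold crossing) = {p_cross:.4f}")
print(f"P(damage >= 1e3)      = {p_tail:.4f}")  # identical by construction
```

Every sample with damage of at least 1e3 is exactly a sample that crossed the threshold, so the two printed probabilities coincide, which is the tail-event correspondence the abstract describes.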
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- North America > United States > California > San Francisco County > San Francisco (0.04)
Inside France's Effort to Shape the Global AI Conversation
One evening early last year, Anne Bouverot was putting the finishing touches on a report when she received an urgent phone call. It was one of French President Emmanuel Macron's aides offering her the role as his special envoy on artificial intelligence. The unpaid position would entail leading the preparations for the France AI Action Summit--a gathering where heads of state, technology CEOs, and civil society representatives will seek to chart a course for AI's future. Set to take place on Feb. 10 and 11 at the presidential Élysée Palace in Paris, it will be the first such gathering since the virtual Seoul AI Summit in May--and the first in-person meeting since November 2023, when world leaders descended on Bletchley Park for the U.K.'s inaugural AI Safety Summit. After weighing the offer, Bouverot, who was at the time the co-chair of France's AI Commission, accepted. But France's Summit won't be like the others.
- Europe > France (1.00)
- Europe > United Kingdom > England > Buckinghamshire > Milton Keynes (0.25)
- Asia > South Korea > Seoul > Seoul (0.25)
- (4 more...)
- Research Report (0.46)
- Personal (0.46)
Why do Experts Disagree on Existential Risk and P(doom)? A Survey of AI Experts
The development of artificial general intelligence (AGI) is likely to be one of humanity's most consequential technological advancements. Leading AI labs and scientists have called for the global prioritization of AI safety citing existential risks comparable to nuclear war. However, research on catastrophic risks and AI alignment is often met with skepticism, even by experts. Furthermore, online debate over the existential risk of AI has begun to turn tribal (e.g. name-calling such as "doomer" or "accelerationist"). Until now, no systematic study has explored the patterns of belief and the levels of familiarity with AI safety concepts among experts. I surveyed 111 AI experts on their familiarity with AI safety concepts, key objections to AI safety, and reactions to safety arguments. My findings reveal that AI experts cluster into two viewpoints -- an "AI as controllable tool" and an "AI as uncontrollable agent" perspective -- diverging in beliefs toward the importance of AI safety. While most experts (78%) agreed or strongly agreed that "technical AI researchers should be concerned about catastrophic risks", many were unfamiliar with specific AI safety concepts. For example, only 21% of surveyed experts had heard of "instrumental convergence," a fundamental concept in AI safety predicting that advanced AI systems will tend to pursue common sub-goals (such as self-preservation). The least concerned participants were the least familiar with concepts like this, suggesting that effective communication of AI safety should begin with establishing clear conceptual foundations in the field.
- South America > Brazil > Rio de Janeiro > South Atlantic Ocean (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Hawaii (0.04)
- (3 more...)
- Questionnaire & Opinion Survey (1.00)
- Overview (1.00)
- Research Report > Experimental Study (0.68)
- Government > Military (0.48)
- Information Technology > Security & Privacy (0.46)
- Education (0.46)